Neuromorphic device based on silicon nanosheets

Silicon is vital for its high abundance, vast production, and perfect compatibility with the well-established CMOS processing industry. Recently, artificially stacked layered 2D structures have gained tremendous attention because their properties can be fine-tuned for electronic devices. This article presents neuromorphic devices based on silicon nanosheets (SiNSs) that are chemically exfoliated and surface-modified, enabling self-assembly into hierarchical stacking structures. The device functionality can be switched between a unipolar memristor and an easily resettable synaptic device. The memory function of the device is based on charge storage in the partially oxidized SiNS stacks, followed by discharge activated by the electric field at the Au-Si Schottky interface, as verified by both experimental and theoretical means. This work further inspired elegant neuromorphic computation models for digit recognition and noise filtration. Ultimately, it brings silicon, the most established semiconductor, back to the forefront for next-generation computation.

Because the SiNSs were prepared by chemical exfoliation, some impurities in the active material are inevitable, which might cause the small shoulder in the Tauc plot that is often seen for chemically synthesized semiconductor nanomaterials. Moreover, we think the diffuse reflection of the film also contributed to the shoulder. As the film was not mirror-smooth, diffuse reflection introduced a discrepancy when we used the absorption spectrum for the calculation. The Tauc plot calculated from the reflection spectrum, which accounts for some of the reflected photons, showed a much less significant shoulder than the one calculated from the absorption spectrum (Figure S6(b)). Nevertheless, due to the unavoidable scattering and intragap states, the shoulder cannot be fully eliminated.
The 15 μm thick channels of partially oxidized, hierarchically stacked SiNS bundles could be regarded as a large number of parallel capacitors, which contributed to a large gross capacitance and carrier storage ability. We then took advantage of this capacitance to fabricate the synaptic device. The channel thickness of 15 μm was chosen within an optimal range, with consideration of the vertical potential distribution in an equivalent circuit, for a proper match of capacitance and resistance. In particular, the larger capacitance contributed to the strong charge storage ability and good synaptic behavior.
We fabricated devices with different channel thicknesses to investigate the impact of channel thickness. The device with a thinner channel (5.8 μm, Figure S7(a)) had a similar I-V hysteresis loop and synaptic response (Figure S7(b-c)). However, the weight change was less significant than in the devices with the thicker channel described in the manuscript. Also, the large resistance exposed the contact capacitance between the probe and the devices, resulting in large discharge spikes at the rising edge and larger measurement errors. A thicker channel gives a larger capacitance and a smaller resistance according to Equations S1 and S2 (a simple linear resistance model and a parallel capacitor model were applied for estimation). The capacitance is proportional to the device thickness, and the resistance is inversely proportional to the thickness. Therefore, a larger thickness was preferred.
$$C_{\text{device}} = n_{\text{bundle}}\,\frac{\varepsilon_0 \varepsilon_r S_{\text{device}}}{d_{\text{bundle}}}$$
However, too thick a channel consumed a large amount of SiNSs and might be unfeasible for large-scale device fabrication. We also considered the vertical electric field distribution of the devices, such as the fringe effect of the electrodes and the nonideality of the parallel capacitance model. The vertical field effect is more significant in thicker devices.
For the present lateral device layout, the curvature and vertical distribution of the electric field were unavoidable. Therefore, the discrepancy between the capacitance extracted from the model (7 × 10⁻⁷ F) and the one calculated from the device dimensions (1 × 10⁻⁷ F) could result from the vertical electric field distribution. The simple models (Equations S1 and S2) served mainly as a qualitative analysis, so actual device fabrication and performance measurement were necessary.
The attachment of the organic modifier and the hierarchical structure of the SiNSs were retained after thermal treatment at 200 °C. We provide the XRD and SEM results of the SiNSs (Figure S8(a-b)). After the 200 °C treatment, the four XRD peaks were retained, which indicated that the SiNS structure did not change (Figure S8(a)). The SEM image shows the cross-section structure of the active SiNS layers in a device (Figure S8(b)). The TEM image of SiNSs treated at 200 °C (Figure S8(c)) revealed the stacking of lamellar SiNS layers, which confirmed the hierarchical structure proposed in the manuscript. A few layers first stuck together to form bundles, while many gaps remained between the bundles. The hierarchical structure broke the long-range order of the stack, so no XRD peak attributable to the stacking structure was observed.
The poor arrangement of the SiNSs and the residual basic solvent (amine) could promote oxidation. Therefore, thermal treatment in an N₂-filled glove box at a temperature above the boiling point (187 °C) of the solvent (pFA) improved the arrangement of the SiNSs and eliminated the solvent. Although the oxidation of the SiNSs could not be fully prevented, it was reduced by increasing the thermal treatment temperature. This reduction was confirmed by UPS, which was used to characterize the channels of the devices treated at 150 °C and 200 °C.
We found that both devices showed a shoulder due to partial oxidation. Compared with the device treated at 150 °C, the device treated at 200 °C had a less significant shoulder, which indicated that thermal treatment at a higher temperature can reduce the oxidation.
The derivation of the formula for the quasi-static curve of the device starts from the voltage balance across the SiNS stacks and the contact:
$$v_{ds} = v_{NSs} + v_{con} = v_{NSs} + i_{ds}\, r_{con}$$
The boundary condition follows from the hysteresis curve being closed and exhibiting central symmetry.
We estimated the capacitance from the device dimensions. Notably, according to the parameter fitting, the device capacitance was fairly large for such a small channel (100 μm × 1000 μm). By contrast, the parallel plate capacitor model (Equation S1) gave a capacitance of ~0.2 pF, which is far smaller than the capacitance obtained from the fitting.
where ε_0 is the vacuum permittivity, ε_r is the relative permittivity of the active layer, S is the area of the parallel plane, and d is the thickness.
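The parallel plate estimate can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that S is the 100 μm × 1000 μm channel footprint, that d is the 15 μm channel thickness quoted for the devices, and that ε_r = 3.9 (a typical value for SiO₂). None of these assignments is stated explicitly in the text, and the function name is hypothetical.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m


def parallel_plate_capacitance(eps_r, area_m2, thickness_m):
    """Parallel plate capacitance C = eps_0 * eps_r * S / d, in farads."""
    return EPSILON_0 * eps_r * area_m2 / thickness_m


# Assumed inputs (not given explicitly in the text):
area_m2 = 100e-6 * 1000e-6   # channel footprint, 100 um x 1000 um
thickness_m = 15e-6          # channel thickness, 15 um
eps_r_sio2 = 3.9             # assumed relative permittivity (SiO2-like)

c_plate = parallel_plate_capacitance(eps_r_sio2, area_m2, thickness_m)
c_fit = 7e-7                 # capacitance extracted from the model fit, F

print(f"parallel plate estimate: {c_plate * 1e12:.2f} pF")
print(f"fitted / parallel plate: {c_fit / c_plate:.1e}")
```

Under these assumptions the estimate comes out at roughly 0.2 pF, several orders of magnitude below the fitted value, which is the gap the bundle-based picture is meant to explain.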
We attributed the large capacitance to the connected capacitors formed by layered, partially oxidized SiNS bundles, which were attached to each other both vertically and horizontally in the channel. The outer shells of the bundles were converted into SiO₂ by the partial oxidation and contributed to the capacitance, while the inner cores remained intact and contributed to the conduction. Thus, when the dimensions of the channel and of the SiNSs were used to estimate the number of bundles (n_bundle), the device capacitance was better estimated as ~1 × 10⁻⁷ F (Equation S2), which was close to the result previously extracted from the fitting.
We observed a sudden large leakage current for the commercial electronic components when the current was reduced to 0, which differs from the behavior of the device we fabricated. This occurred because the discrete diode components changed to the flat-band mode, so the electrons stored in the capacitor could easily deplete through the diode (Figure S10(d)). For the SiNS device, by contrast, the Schottky junction could never reach the flat-band mode. The Schottky diode was formed by the SiNSs, and some carriers could be stored at the junction because of the tilt of the band. Thus, the carriers in the SiNSs could not return to the electrode (Figure S10(c)).
The first cycle differed from subsequent cycles in that its threshold voltage for switching from LRS to HRS was smaller. This was because, during the measurement interval (Figure 2(h)), the charge stored in C_NSs from the former measurement transferred to the Schottky junction. Owing to the carriers stored in C_NSs, the voltage (v_NSs) across the SiNS stacks (C_NSs and R_NSs) would not suddenly disappear. During the measurement interval without applied voltage (v_ds), the remaining v_NSs caused carrier drift and redistribution (Figure S12(a)).
The holes transferred to the electrode without hindrance, while the Schottky barrier impeded the movement of electrons. Driven by the electric field, electrons accumulated at the side of the Schottky junction with the higher electric potential. To balance the negative charge of the electrons accumulated at the junction, the positively charged space charge region enlarged. The redistribution of carriers affected the bandgap and the electric field near the junctions. The larger electric field facilitated breakdown at a lower applied voltage (Figure S12(b)). Therefore, the NDR effect and the charging of C_NSs occurred at a lower voltage in the first cycle. The redistribution prevented the complete vanishing of the information stored in the memristor but led to the difference between the first cycle and subsequent cycles.
For synapses, the synaptic weight refers to the signal intensity transmitted from the presynaptic to the postsynaptic neuron. In nervous systems, the synaptic weight is automatically tuned by the time intervals and rates of the spikes passing through the synapses. These processes are believed to be closely connected to learning and memory. Our devices could mimic these fundamental functions.
For our devices, the voltage V_ds applied on the drain was the input signal, and the channel current I_ds of the devices could be regarded as the synaptic weight. The current at the falling edge of the voltage spikes was measured to avoid the non-ideality at the rising edge (the capacitance between the test probe and the devices could lead to an irregular current gain). The applied spike voltage was as low as 1 V to ensure a low energy consumption per spike (~10⁻⁸ J), which is an important merit for neuromorphic computation. Also, under low voltage, explicit synaptic behaviors could be observed (Figure S12(a-c)) without large contact resistance variation. When the applied voltage was large, the NDR effect could cause abnormal PPF for the first several spikes (Figure S12(d)). Meanwhile, the duration and the interval of the spikes were both set as long as 1 s, considering the limited measurement precision.
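The energy per spike quoted above follows from treating each spike as a rectangular pulse, E = V × I × t. The minimal sketch below illustrates that arithmetic; the 10 nA current is an assumption inferred from the quoted ~10⁻⁸ J at 1 V and 1 s, not a value reported in the text, and the function name is hypothetical.

```python
def spike_energy(voltage_v, current_a, duration_s):
    """Energy of one rectangular spike, E = V * I * t, in joules."""
    return voltage_v * current_a * duration_s


# 1 V spike lasting 1 s; the 10 nA channel current is an assumed value
# chosen to be consistent with the quoted ~1e-8 J per spike.
energy_j = spike_energy(voltage_v=1.0, current_a=1e-8, duration_s=1.0)
print(f"energy per spike: {energy_j:.1e} J")
```

The same function shows why reducing the spike voltage, the channel current, or the spike duration each lowers the energy per spike linearly.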
For the simulation, we tuned the value of C_NSs to 1 × 10⁻⁷ F for a better match. We found that the boundary condition could vary considerably because r_con is not constant in practice, which significantly affected the fitted value of C_NSs discussed for the quasi-static I-V measurement. After adjusting C_NSs, the simulated I-V loop also corresponded well with the experimental result (Figure 2(f)).
In the manuscript, we mainly focused on the mechanism of the devices. Energy efficiency and operating speed are certainly critical parameters of neuromorphic devices. For these two parameters, our device was far behind the state-of-the-art devices (e.g., zero-static-power MoS₂ atomristors enable high-frequency switches that operate at 50 GHz, ref. 2). Therefore, one of the aims of our ongoing work is to develop new, specific syntheses that modify the materials to yield devices with better performance. Figure R6 shows that a device using the modified SiNSs had I-V and synaptic characteristics similar to those of the devices described in the manuscript. Meanwhile, it had a much lower energy consumption (~0.3 pJ = 0.3 s × 0.01 V × 10⁻¹⁰ A) and a higher working frequency (5 Hz), which might support further practical applications of SiNSs. Nevertheless, since improving the working frequency and power consumption is less relevant to the focus of this work, we intend to report these details separately in future work.
LTM corresponds to the ability to maintain the synaptic weight, which can be regarded as the synaptic plasticity involved in learning and memory activities. The following is the derivation of the formula for the LTM:
$$v_{ds} = v_{NSs} + v_{con} = v_{NSs} + i_{ds}\, r_{con}$$
Let $v_{NSs}(n) = v_{NSs}\big((2n+1)\,t_{spk}\big)$; then
$$i_{ds}(n) = \frac{v_{con}}{r_{con}} = \frac{v_{ds} - v_{NSs}(n)}{r_{con}}$$
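The charge-storage picture behind the LTM behavior can be illustrated with a linear equivalent circuit: C_NSs in parallel with R_NSs, in series with a constant r_con. The sketch below is not the authors' simulation. It assumes a constant r_con (the text notes that the real r_con is not constant), hypothetical resistance values of 10⁸ Ω, and that the drain is held at 0 V between spikes. Only C_NSs = 1 × 10⁻⁷ F and the 1 V, 1 s spike settings come from the text. For piecewise-constant input, the capacitor voltage relaxes exponentially, so each segment is updated exactly.

```python
import math


def simulate_spike_train(n_spikes, v_spike=1.0, t_spike=1.0, t_interval=1.0,
                         c_nss=1e-7, r_nss=1e8, r_con=1e8):
    """Return (v_NSs, i_ds) sampled at the falling edge of each spike.

    Linear sketch: C_NSs || R_NSs in series with a constant r_con.
    r_nss and r_con are hypothetical values, not taken from the text.
    """
    r_parallel = r_nss * r_con / (r_nss + r_con)
    tau = c_nss * r_parallel                      # relaxation time constant, s
    v_target = v_spike * r_nss / (r_nss + r_con)  # steady state during a spike
    v_nss = 0.0
    v_samples, i_samples = [], []
    for _ in range(n_spikes):
        # Charging while the spike is applied (exact exponential update).
        v_nss = v_target + (v_nss - v_target) * math.exp(-t_spike / tau)
        v_samples.append(v_nss)
        # Current at the falling edge: i_ds = (v_ds - v_NSs) / r_con.
        i_samples.append((v_spike - v_nss) / r_con)
        # Discharging during the interval with the drain at 0 V.
        v_nss *= math.exp(-t_interval / tau)
    return v_samples, i_samples


v_short, i_short = simulate_spike_train(10, t_interval=1.0)
v_long, i_long = simulate_spike_train(10, t_interval=50.0)
print(f"stored voltage after 10 spikes, 1 s interval:  {v_short[-1]:.3f} V")
print(f"stored voltage after 10 spikes, 50 s interval: {v_long[-1]:.3f} V")
```

In this simplified model the stored voltage accumulates when the spike interval is short compared with the relaxation time and is largely lost when the interval is long. A Schottky contact would make r_con voltage dependent, which this sketch deliberately ignores.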
The following is the derivation of the formula for the STDP:
$$v_{ds} = v_{NSs} + v_{con} = v_{NSs} + i_{ds}\, r_{con}$$
Output currents of the S1-1, S1-2, S2-1, and S2-2 devices, respectively. The red spikes correspond to the excitation spikes, the blue ones to the inhibition spikes, and the black ones to the examining spikes.
We included a proof-of-concept demonstration of an actual device application to verify the feasibility of constructing scalable neuromorphic networks. We used 4 devices, which served as synapses connecting the inputs and outputs, to build a 2-input, 2-output binary classification network. This is the simplest classification network, yet it shares the same operating principle as the network we put forward in the manuscript for digit classification.
We applied the second-quadrant operation pattern of the device to demonstrate the functionality of the network. Negative voltage pulses were applied as excitation spikes to increase the synaptic weight. Positive pulses were applied either as examining spikes, to check whether the output current of the device exceeded the threshold for emitting an output signal, or as inhibition spikes, to decrease the synaptic weight.
For the input sequence "1, 0", I1 emitted high-frequency spikes while I2 emitted low-frequency spikes. Upon the examining pulses, we found that S1-1 had the largest increase in synaptic current, owing to residual weight differences or moderate device-to-device differences. In this case, S1-1 reached the threshold (2.5 nA, marked by red dashed lines) first and fired O1, while O2 was inhibited according to the winner-take-all strategy. Afterward, additional positive inhibition pulses were applied to the synapses linked to O2 (S1-2 and S2-2). The inhibition pulses made the synaptic weight of S1-2 lower than that of S1-1 and unlikely to reach the threshold. As for S2-1 and S2-2, the low-frequency input signal was insufficient to excite the devices. Therefore, only S1-1 was enhanced and the connection between I1 and O1 continued to strengthen, while the other synaptic devices were inhibited. O1 was then assigned to the sequence "1, 0".
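The winner-take-all procedure described above can be sketched as a toy model. This is an illustration, not the measured device behavior: the additive current update, the gain, the inhibition step, and the initial currents are all hypothetical. Only the 2.5 nA threshold, the 2-input, 2-output layout, and the rule that synapses linked to the losing output are inhibited come from the text. Keys are (input, output) pairs, so (1, 1) stands for S1-1.

```python
THRESHOLD_NA = 2.5  # firing threshold quoted in the text, in nA


def run_winner_take_all(rates, initial_currents, gain_na=0.5,
                        inhibition_na=0.4, max_steps=20):
    """Toy winner-take-all loop; returns (winning output, final currents).

    rates maps input index -> relative spike rate (hypothetical units).
    initial_currents maps (input, output) -> synaptic current in nA.
    """
    currents = dict(initial_currents)
    for _ in range(max_steps):
        # Excitation: each synapse gains current in proportion to its input rate.
        for (inp, out) in list(currents):
            currents[(inp, out)] += gain_na * rates[inp]
        # Examination: find synapses at or above the threshold.
        above = [(value, key) for key, value in currents.items()
                 if value >= THRESHOLD_NA]
        if above:
            _, (_, winner) = max(above)
            # Inhibition: lower the synapses linked to the other output.
            for (inp, out) in list(currents):
                if out != winner:
                    currents[(inp, out)] = max(
                        0.0, currents[(inp, out)] - inhibition_na)
            return winner, currents
    return None, currents


# Small assumed initial differences stand in for device-to-device variation.
initial = {(1, 1): 1.0, (1, 2): 0.9, (2, 1): 0.9, (2, 2): 1.0}
winner_10, final_10 = run_winner_take_all({1: 1.0, 2: 0.1}, initial)
winner_01, final_01 = run_winner_take_all({1: 0.1, 2: 1.0}, initial)
print("input '1, 0' ->", f"O{winner_10}")
print("input '0, 1' ->", f"O{winner_01}")
```

With these assumed values, the synapse that starts slightly ahead on the high-frequency input reaches the threshold first, and the synapses linked to the other output are pushed further from the threshold.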
For the opposite input sequence "0, 1", in which I2 emitted high-frequency spikes while I1 emitted low-frequency spikes, the opposite result could be expected. Occasionally, however, a faulty classification occurred, in which the high-frequency inputs of both I1 and I2 were assigned to O1. This could be due to the lack of redundancy in the output neurons. As in the digit classification network with 300 output neurons for 10 digits mentioned in the manuscript, redundancy of output neurons takes advantage of the device-to-device differences. With redundant devices, a better-connected device can reach the threshold more easily and produce the correct classification.
Note that our device-based network used the STDP algorithm and was trained continuously while performing recognition at the same time. To evaluate the training process of this SNN, we deem training complete when the network is stable, that is, when almost all the inputs are separated into several fixed output categories that no longer change as the training time increases. During recognition, the synaptic weights can still be modified when a stimulus is received. The devices reached the stable state after one iteration of training, as the demonstrative inputs are comparatively simple.
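The stopping criterion described above (training is complete once the input-to-output assignments stop changing) can be expressed as a simple check. The sketch below is one possible reading, not the authors' procedure: the snapshot format, the function name, and the `patience` parameter are hypothetical.

```python
def training_is_stable(assignment_history, patience=3):
    """Return True if the last `patience` assignment snapshots are identical.

    Each snapshot maps an input pattern to the output neuron assigned to it.
    """
    if len(assignment_history) < patience:
        return False
    recent = assignment_history[-patience:]
    return all(snapshot == recent[0] for snapshot in recent)


# Hypothetical history for the 2-input, 2-output network.
history = [
    {"1, 0": "O1", "0, 1": "O1"},  # faulty assignment early on
    {"1, 0": "O1", "0, 1": "O2"},
    {"1, 0": "O1", "0, 1": "O2"},
    {"1, 0": "O1", "0, 1": "O2"},
]
print(training_is_stable(history))      # last three snapshots agree
print(training_is_stable(history[:3]))  # assignments still changing
```

A larger `patience` makes the check more conservative, at the cost of more training iterations before the network is declared stable.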